Exploring Dimensionality Reductions with Forward and Backward Projections

Authors

  • Marco Cavallo
  • Çagatay Demiralp
Abstract

Dimensionality reduction is a common method for analyzing and visualizing high-dimensional data across domains. Dimensionality-reduction algorithms involve complex optimizations and the reduced dimensions computed by these algorithms generally lack clear relation to the initial data dimensions. Therefore, interpreting and reasoning about dimensionality reductions can be difficult. In this work, we introduce two interaction techniques, forward projection and backward projection, for reasoning dynamically about scatter plots of dimensionally reduced data. We also contribute two related visualization techniques, prolines and feasibility map, to facilitate and enrich the effective use of the proposed interactions, which we integrate in a new tool called Praxis. To evaluate our techniques, we first analyze their time and accuracy performance across varying sample and dimension sizes. We then conduct a user study in which twelve data scientists use Praxis so as to assess the usefulness of the techniques in performing exploratory data analysis tasks. Results suggest that our visual interactions are intuitive and effective for exploring dimensionality reductions and generating hypotheses about the underlying data.
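
The abstract does not spell out how the two interactions are computed. As a rough illustration only, the sketch below assumes a linear reduction (PCA-like) with a known mean and orthonormal axes. The names `MEAN`, `BASIS`, `forward_project`, and `backward_project` are hypothetical and are not taken from Praxis. Forward projection asks where a point moves in the scatter plot when one of its original dimensions is edited. Backward projection asks which high-dimensional point corresponds to a chosen plot position. Since many points can map to the same position, this sketch simply picks the one lying in the projection subspace.

```python
# Minimal sketch, assuming a linear (PCA-like) reduction from 3-D to 2-D.
# MEAN and BASIS are made-up illustrative values, not data from the paper.
MEAN = [1.0, 2.0, 3.0]
BASIS = [              # two orthonormal projection axes, one per row
    [1.0, 0.0, 0.0],
    [0.0, 0.6, 0.8],
]


def forward_project(x):
    """Map a high-dimensional point to its 2-D scatter-plot position."""
    centered = [xi - mi for xi, mi in zip(x, MEAN)]
    return [sum(c * b for c, b in zip(centered, axis)) for axis in BASIS]


def backward_project(y):
    """Map a 2-D position back to one compatible high-dimensional point.

    The inverse is not unique; this returns the point inside the
    subspace spanned by BASIS (an assumption of this sketch).
    """
    return [
        m + sum(yk * axis[j] for yk, axis in zip(y, BASIS))
        for j, m in enumerate(MEAN)
    ]


# Forward projection as a what-if: edit one dimension, observe the 2-D shift.
point = [2.0, 2.0, 3.0]
before = forward_project(point)

edited = list(point)
edited[2] = 4.0                      # hypothetical change to the third dimension
after = forward_project(edited)
shift = [a - b for a, b in zip(after, before)]

# Backward projection: recover a high-dimensional point for the new position.
recovered = backward_project(after)
roundtrip = forward_project(recovered)
```

In this setup `recovered` generally differs from `edited`, yet both land on the same 2-D position. That ambiguity is why a backward projection needs additional assumptions or constraints to be meaningful.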

Related resources

Clustrophile: A Tool for Visual Clustering Analysis

While clustering is one of the most popular methods for data mining, analysts lack adequate tools for quick, iterative clustering analysis, which is essential for hypothesis generation and data reasoning. We introduce Clustrophile, an interactive tool for iteratively computing discrete and continuous data clusters, rapidly exploring different choices of clustering parameters, and reasoning abou...

A Visual Interaction Framework for Dimensionality Reduction Based Data Exploration

Dimensionality reduction is a common method for analyzing and visualizing high-dimensional data. However, reasoning dynamically about the results of a dimensionality reduction is difficult. Dimensionality-reduction algorithms use complex optimizations to reduce the number of dimensions of a dataset, but these new dimensions often lack a clear relation to the initial data dimensions, thus making...

A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...

Forward and Backward Uncertainty Quantification in Optimization

This contribution gathers some of the ingredients presented during the Iranian Operational Research community gathering in Babolsar in 2019. It is a collection of several previous publications on how to set up an uncertainty quantification (UQ) cascade with ingredients of growing computational complexity for both forward and reverse uncertainty propagation.

Journal title:
  • CoRR

Volume abs/1707.04281  Issue 

Pages  -

Publication date 2017